Abstract. Let $V$ denote a set of $N$ vertices. To construct a hypergraph process, create a new hyperedge at each event time of a Poisson process; the cardinality $K$ of this hyperedge is random, with generating function $p(x) := \sum_k p_k x^k$, where $P(K = k) = p_k$; given $K = k$, the $k$ vertices appearing in the new hyperedge are selected uniformly at random from $V$. Assume $p_1 + p_2 > 0$. Hyperedges of cardinality 1 are called patches, and serve as a way of selecting root vertices. Identifiable vertices are those which are reachable from these root vertices, in a strong sense which generalizes the notion of graph component. Hyperedges are called identifiable if all of their vertices are identifiable. We use "fluid limit" scaling: hyperedges arrive at rate $N$, and we study structures of size $O(1)$ and $O(N)$. After division by $N$, numbers of identifiable vertices and hyperedges exhibit phase transitions, which may be continuous or discontinuous depending on the shape of the structure function $-\log(1 - x)/p'(x)$, $x \in (0, 1)$. Both the case $p_1 > 0$, and the case $p_1 = 0 < p_2$ are considered; for the latter, a single extraneous patch is added to mark the root vertex.

1. Introduction

The k-core of a graph is the largest subgraph with minimum degree at least k. 



Pittel et al. (1996) study the following algorithm for finding the 2-core of a graph:

1. If vertices of degree one exist, select one and remove the edge incident to it. This may cause the degree of other vertices to drop.

2. If there are no degree one vertices remaining, stop.

3. Repeat.

The graph obtained at the conclusion of this algorithm is the 2-core. 
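The peeling procedure above can be sketched in a few lines of Python. This is only an illustration, assuming a simple graph (no loops or repeated edges) given as a list of vertex pairs; the function name `two_core` is ours and does not come from Pittel et al. (1996).

```python
from collections import defaultdict

def two_core(edges):
    """Return the edge set of the 2-core of a simple graph by peeling degree-one vertices."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    leaves = [v for v in adj if len(adj[v]) == 1]
    while leaves:
        v = leaves.pop()
        if len(adj[v]) != 1:          # degree changed since v was queued
            continue
        (w,) = adj[v]                 # the unique neighbour of v
        adj[v].remove(w)              # remove the edge incident to v
        adj[w].remove(v)
        if len(adj[w]) == 1:          # w may have become a degree-one vertex
            leaves.append(w)
    return {frozenset((u, v)) for u in adj for v in adj[u]}
```

For a triangle with a pendant path attached, only the triangle survives; for a tree, nothing does.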

This algorithm is a special case of another, run on hypergraphs, called hypergraph collapse and first studied in Darling and Norris (2004). By a hypergraph we shall mean a map $\Lambda : 2^V \to \{0, 1, 2, \ldots\}$, where $V$ is a finite set of vertices and $2^V$ is the set of subsets of $V$. It will sometimes be helpful to think in terms of an edge-labelling of $\Lambda$, which is a choice of a set $I$ and a map $e : I \to 2^V$ such that $\Lambda(A) = |\{i \in I : e(i) = A\}|$ for all $A$. Thus $e$ describes a set of labelled subsets of $V$, which we call hyperedges, and then $\Lambda$ gives the number of hyperedges at each subset of $V$. Hyperedges of unit cardinality are called patches. Hypergraph collapse is the following algorithm:

1. If a patch exists, select one and remove it together with the unique vertex $v$ it contains. This will cause any other hyperedge $e(i)$ containing $v$ to be replaced by $e(i) \setminus \{v\}$.

2. If there are no patches remaining, stop.

3. Repeat.

Although we have described the algorithm in terms of an edge-labelled hypergraph, the possible moves for $\Lambda$ do not depend on the edge-labelling chosen. The vertices which are removed by hypergraph collapse are called identifiable, and hyperedges which contain only identifiable vertices are also called identifiable. These definitions do not depend on the order in which patches are chosen during hypergraph collapse; see Darling and Norris (2004).
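As an illustration of hypergraph collapse, here is a minimal Python sketch working on an edge-labelled hypergraph given as a list of vertex sets. The function name `collapse` and the return convention (identifiable vertices, plus the labels of identifiable hyperedges) are assumptions made for this example.

```python
def collapse(hyperedges):
    """Run hypergraph collapse on a list of hyperedges (iterables of vertices).

    Returns (identifiable_vertices, labels_of_identifiable_hyperedges).
    """
    remaining = [set(e) for e in hyperedges]      # e(i) minus the removed vertices
    containing = {}                               # vertex -> labels of hyperedges containing it
    for i, e in enumerate(remaining):
        for v in e:
            containing.setdefault(v, []).append(i)
    patches = [i for i, e in enumerate(remaining) if len(e) == 1]
    identifiable = set()
    while patches:
        i = patches.pop()
        if len(remaining[i]) != 1:                # emptied by an earlier removal
            continue
        (v,) = remaining[i]
        identifiable.add(v)
        for j in containing[v]:                   # replace e(j) by e(j) \ {v}
            remaining[j].discard(v)
            if len(remaining[j]) == 1:
                patches.append(j)
    # A hyperedge is identifiable exactly when all of its vertices were removed.
    return identifiable, [i for i, e in enumerate(remaining) if not e]
```

Because the outcome does not depend on the order in which patches are chosen, popping patches from a stack is as good as any other rule.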



The core-finding algorithm of Pittel et al. (1996) is hypergraph collapse applied to the dual hypergraph. To obtain the dual, note that we can think of $e$ as a subset of $V \times I$. The roles of $V$ and $I$ are now symmetric, so $e$ also corresponds to an edge-labelling of a hypergraph $\Lambda'$ in which the status of vertices and hyperedges is reversed. Vertices (resp. hyperedges) of $\Lambda$ not in the core correspond to identifiable hyperedges (resp. identifiable vertices) of $\Lambda'$. More information about graph cores can be found in Fountoulakis (2002), and hypergraph cores are considered by Cooper (2002).

The identifiable vertices obtained by hypergraph collapse also serve to generalize 
to hypergraphs the definition of graph component. A graph is a hypergraph having 
edges only of cardinality two, and consequently has no patches. However, if the single 
hyperedge {v} is added to the graph [making it a hypergraph], then the identifiable 
vertices obtained by running hypergraph collapse on the augmented graph are exactly 
the vertices in the graph component containing v. The identifiable edges are all the 
edges of this graph component. 

This motivates the following definition for patch-free hypergraphs: A vertex is in 
the domain of v if it is in the set of identifiable vertices when the hypergraph is 
augmented by the addition of the hyperedge {v}. 

The purpose of this paper is to study the time-evolution of the set of identifiable vertices and the set of identifiable edges in a Poisson hypergraph process, which is a hypergraph-valued, continuous-time stochastic process. The vertex set is $V = \{1, 2, \ldots, N\}$, and the process depends on parameters $\{p_j\}_{j=1}^{N}$. Attached to each subset $A$ of $V$ is a Poisson clock run at rate $N p_{|A|} / \binom{N}{|A|}$, and these clocks are independent of one another. [Here $|A|$ denotes the cardinality of $A$.] When the clock associated to $A$ "rings", a new hyperedge equal to $A$ is added to the hypergraph. The overall rate at which hyperedges of cardinality $k$ are added is then $N p_k$. We will call this process the Poisson($p$) hypergraph process. This is a generalization of the ordinary random graph process, in which edges form between each pair of vertices independently at a fixed rate.

While for $N$ fixed, this process depends only on the finite sequence $\{p_k\}_{k=1}^{N}$, we will be interested in the asymptotic behavior as $N \to \infty$, so we will always assume that the infinite sequence $\{p_k\}_{k=1}^{\infty}$ is given. Moreover, this sequence is required to be a probability distribution on $\{1, 2, \ldots\}$ with finite expectation and satisfying $p_1 + p_2 > 0$. The generating function $x \mapsto \sum_{k=1}^{\infty} p_k x^k$ will be denoted by $p$.

In Darling and Norris (2004), the Poisson($\beta$) random hypergraph is defined, where $\{\beta_k\}$ is a sequence of positive real numbers. This is a random hypergraph with vertex set $V = \{1, 2, \ldots, N\}$, so that for each $A \subseteq V$, the number of occurrences of the hyperedge $A$ is a Poisson random variable with expectation $N \beta_{|A|} / \binom{N}{|A|}$; these random variables are independent for different subsets of $V$. If $\{\Lambda_t\}_{t \ge 0}$ is a Poisson($p$) hypergraph process, then for fixed $t > 0$, $\Lambda_t$ is a Poisson($tp$) random hypergraph.

We separate out two distinct cases in the study of Poisson hypergraph processes, depending on whether $p_1 > 0$ or $p_1 = 0$. When $p_1 = 0$, the hypergraph never acquires patches, and provided the initial hypergraph is patch-free, the set of identifiable vertices is forever void. As the previous discussion of ordinary graphs suggests, it is natural to consider in such cases the set of vertices in the domain of a distinguished vertex.

We discuss now the case $p_1 > 0$. Our first result describes the evolution of the rescaled number of identifiable vertices and hyperedges in the Poisson($p$) hypergraph process $\{\Lambda_t\}_{t \ge 0}$. Let

(1)    $\bar T_t^N := \dfrac{|\text{identifiable vertices in } \Lambda_t|}{N}, \qquad \bar Z_t^N := \dfrac{|\text{identifiable hyperedges in } \Lambda_t|}{N}.$

The structure function $t$, defined as

(2)    $t(x) := \dfrac{-\log(1 - x)}{p'(x)}, \quad x \in (0, 1),$

plays a central role for hypergraph processes. [Recall that $p(x) = \sum_{k \ge 1} p_k x^k$.] Typically $t$ is not invertible, but there is a right-continuous monotonic function called the lower envelope:

(3)    $g(s) := \inf\{x \in (0, 1) : t(x) > s\}, \quad s \ge 0.$

Also important for hypergraph processes is the upper envelope:

(4)    $g^*(s) := \sup\{x \in (0, 1) : t(x) < s\} \vee 0, \quad s \ge 0.$

We classify structure functions into three types: graph-like, bicritical, and exceptional. This taxonomy is given in Table 1. Figure 1 shows a bicritical structure function and the corresponding lower envelope.
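To make the definitions (2) and (3) concrete, the following Python sketch evaluates the structure function and approximates the lower envelope on a grid. It is an illustration only: the dictionary encoding of $p$ (size $k$ mapped to $p_k$), the grid resolution, and the function names are assumptions of this example, and the grid search returns $g(s)$ only up to the grid spacing.

```python
import math

def structure_function(p, x):
    """t(x) = -log(1 - x) / p'(x) for p(x) = sum_k p[k] * x**k, with 0 < x < 1."""
    dp = sum(k * pk * x ** (k - 1) for k, pk in p.items())
    return -math.log1p(-x) / dp

def lower_envelope(p, s, grid=10_000):
    """Grid approximation of g(s) = inf{x in (0, 1) : t(x) > s}."""
    for i in range(1, grid):
        x = i / grid
        if structure_function(p, x) > s:
            return x
    return 1.0   # no grid point qualifies, so g(s) lies within one grid step of 1
```

With patches only, `p = {1: 1.0}`, the structure function is $-\log(1 - x)$ and the envelope is $g(s) = 1 - e^{-s}$. For the cubic `p = {1: 0.01, 3: 0.99}` the structure function rises to a local maximum slightly below 3 near $x \approx 0.06$ and then dips, so the approximate envelope jumps from a value below 0.1 to a value close to 1 as $s$ passes that level.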

Let $\mathcal{S} \subset \mathbb{R}_+$ denote the discontinuity set of $g$:

(5)    $\mathcal{S} := \{s > 0 : g(s-) \ne g(s)\},$

where $g(s-) := \lim_{t \uparrow s} g(t)$.

For $s \in \mathcal{S}$, both $g(s-)$ and $g(s)$ are zeros of the function $x \mapsto s p'(x) + \log(1 - x)$. For the sake of simplicity of exposition, we shall assume below that there are never any zeros of this function strictly between $g(s-)$ and $g(s)$:

(6)    $\{x : s p'(x) + \log(1 - x) = 0\} \cap (g(s-), g(s)) = \emptyset$, for all $s \in \mathcal{S}$.

Type          Description                                               Example of p(x)
graph-like    t is strictly increasing, and g and g* are continuous.    cubic with 3p_3 < p_2
bicritical    g and g* each have exactly one discontinuity.             cubic with 3p_3 > p_2
exceptional   g or g* has two or more discontinuities.                  1000

Table 1. Classification of structure functions

Figure 1. Left: Bicritical structure function, with $t(x)$ on the horizontal axis, corresponding to a quartic polynomial $p(x)$ with $0 < p_1 < p_2 < p_3 < p_4$. Right: Lower envelope $g(s)$ plotted against $s$, showing the single discontinuity.



Also assume that $\mathcal{S}$ has no accumulation points. This is true, for example, if

Let $\{B_s, s \in \mathcal{S}\}$ denote a collection of independent Bernoulli(1/2) random variables, indexed by the discontinuity set $\mathcal{S}$. Define

(7)    $T_t := g(t-) + B_t\,(g(t) - g(t-))$ for $t \in \mathcal{S}$;    $T_t := g(t)$ for $t \notin \mathcal{S}$.

In other words, at each point of discontinuity we choose the left limit or the right limit of $g$ according to the flip of a fair coin. Finally, let

(8)    $Z_t := t\,p(T_t) - (1 - T_t) \log(1 - T_t).$

For a sequence of stochastic processes $\{X^N\}_{N=1}^{\infty}$, where $X^N = \{X_t^N\}_{t \ge 0}$, and a stochastic process $X = \{X_t\}_{t \ge 0}$, we write $X^N \xrightarrow{\text{f.d.d.}} X$ if the finite-dimensional distributions of $X^N$ converge to those of $X$. For a sequence of random variables (or vectors) $\{X^N\}$, we write $X^N \xrightarrow{d} X$ to indicate that $X^N$ converges in distribution to $X$.

Theorem 1. Consider a Poisson hypergraph process such that $p_1 > 0$, and suppose (6) holds. As $N \to \infty$,

(9)    $\{(\bar T_t^N, \bar Z_t^N)\}_{t \ge 0} \xrightarrow{\text{f.d.d.}} \{(T_t, Z_t)\}_{t \ge 0}.$

Furthermore for any compact interval $I \subset [0, \infty) \setminus \mathcal{S}$,

(10)    $\sup_{t \in I} \big\| (\bar T_t^N, \bar Z_t^N) - \big(g(t),\; t\,p(g(t)) - [1 - g(t)] \log(1 - g(t))\big) \big\| \to 0$

in probability as $N \to \infty$.

We now turn to the case of patch-free hypergraph processes, i.e. the regime where $p_1 = 0 < p_2$. In this case $g(s) = 0$ for all $s \in [0, (2p_2)^{-1})$. There are three possibilities for the behavior of $g$ at $(2p_2)^{-1}$, enumerated in Table 2.

Sub-case of p_1 = 0 < p_2    Behavior of g
3p_3 < p_2                   g is continuous at (2p_2)^{-1}, and right derivative is finite
3p_3 = p_2                   p_4, p_5, ... determine whether g is continuous at (2p_2)^{-1}
3p_3 > p_2                   g is discontinuous at (2p_2)^{-1}

Table 2. The $0 = p_1 < p_2$ regime.

For simplicity, we focus on the case where $g$ has a single discontinuity, located at $(2p_2)^{-1}$; i.e. $\mathcal{S} = \{(2p_2)^{-1}\}$. The general case follows the same pattern as Theorem 1 because after the number of identifiable vertices has reached $O(N)$, the subsequent evolution is much the same as the $p_1 > 0$ case.

In the $p_1 = 0$ and $p_2 > 0$ regime, another structure function besides (2) comes into play, namely the structure function $t_2$ of the graph which results from discarding all hyperedges of cardinality more than two:

(11)    $t_2(x) := \dfrac{-\log(1 - x)}{2 p_2 x}, \quad x \in (0, 1).$

Since $t_2$ is monotonic, the corresponding lower envelope $g_2$ defined as

(12)    $g_2(s) := \inf\{x \in (0, 1) : t_2(x) > s\}, \quad s \ge 0,$

is continuous. As before, $g_2(s) = 0$ for $0 \le s \le (2p_2)^{-1}$, and $g_2(s) \to 1$ as $s \to \infty$; it describes the asymptotic proportion of vertices in the giant component of a random graph where the ratio of edges to vertices is $s p_2$.

We will construct in Section 7 an increasing process $\{M_t\}$ so that the distribution of $M_t$ is

(13)    $P(M_t = n) = e^{-2 t p_2 n} (2 t p_2 n)^{n-1} / n!$ if $n \in \mathbb{N}$;    $P(M_t = n) = \varphi_t$ if $n = \infty$,

where $\varphi_t$ is the largest solution $x$ in $[0, 1]$ of $2 t p_2 x + \log(1 - x) = 0$. [Notice that $\varphi_t = 0$ for $2 t p_2 \le 1$, and $0 < \varphi_t < 1$ otherwise.]
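The law (13) can be checked numerically: the finite masses should sum to $1 - \varphi_t$. The sketch below, with $\lambda$ standing for $2 t p_2$, finds $\varphi_t$ by bisection and evaluates the finite masses in log space. The function names and the tolerance are assumptions of this example.

```python
import math

def survival_root(lam, tol=1e-12):
    """Largest x in [0, 1] with lam * x + log(1 - x) = 0, where lam plays the role of 2 t p_2."""
    if lam <= 1.0:
        return 0.0
    # x -> lam*x + log(1-x) is concave, zero at 0, maximal at 1 - 1/lam, and -> -inf at 1,
    # so the largest root lies in (1 - 1/lam, 1).
    lo, hi = 1.0 - 1.0 / lam, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if lam * mid + math.log1p(-mid) > 0:
            lo = mid
        else:
            hi = mid
    return lo

def finite_mass(lam, n):
    """P(M = n) = exp(-lam * n) * (lam * n)**(n - 1) / n!, computed in log space."""
    return math.exp(-lam * n + (n - 1) * math.log(lam * n) - math.lgamma(n + 1))
```

For example, with $\lambda = 2$ the root is approximately 0.7968, and the masses `finite_mass(2.0, n)` summed over $n$ give approximately 0.2032.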

Write $T_t^N$ for the number of vertices in the domain of $v_0$ in $\Lambda_t$, and write $Z_t^N$ for the number of hyperedges identifiable from $v_0$ in $\Lambda_t$. Set $\bar T_t^N := N^{-1} T_t^N$ and $\bar Z_t^N := N^{-1} Z_t^N$. Also, define

$\tilde T_t := g(t)\, 1_{\{M_t = \infty\}}$;

$\tilde Z_t := \{t\,p(g(t)) - [1 - g(t)] \log(1 - g(t))\}\, 1_{\{M_t = \infty\}}$.

Theorem 2. Consider a Poisson hypergraph process such that $p_1 = 0 < p_2$, and suppose $g$ has a single discontinuity located at $(2p_2)^{-1}$. Fix a distinguished vertex $v_0$. The number of vertices in the domain of $v_0$, and number of hyperedges identifiable from $v_0$, obey the following limits in distribution as $N \to \infty$:

(14)    $\{(T_t^N, Z_t^N)\}_{t \ge 0}$ converges weakly in $D([0, \infty), (\mathbb{N} \cup \{\infty\})^2)$ to $\{(M_t, M_t)\}_{t \ge 0}$,

where we adjoin $\infty$ to $\mathbb{N}$ as a compactifying point. Also

(15)    $\{(\bar T_t^N, \bar Z_t^N)\}_{t \ge 0} \xrightarrow{\text{f.d.d.}} \{(\tilde T_t, \tilde Z_t)\}_{t \ge 0}.$

Remark 1.1. Observe the difference between the limit law $\{(\tilde T_t, \tilde Z_t)\}_{t \ge 0}$ in (15) and the limit law $\{(T_t, Z_t)\}_{t \ge 0}$ in (9): $T_t$ conforms to the deterministic lower envelope $g(t)$, except at points in the finite discontinuity set, whereas $\tilde T_t$ waits until the random time $X := \inf\{t > 0 : M_t = \infty\}$, with distribution function $g_2(t)$, before jumping from 0 up to $g(t)$.

R.W.R. DARLING, D.A. LEVIN, AND J.R. NORRIS

Remark 1.2. See Remark 5.2 as to whether the convergence (15) extends to weak convergence in the Skorohod space $D([0, \infty), \mathbb{R}^2)$.

The rest of this paper is organized as follows: Some definitions concerning hypergraphs are given in Section 2. We establish that certain key processes are Markov in Section 3. The cases of hypergraphs and hypergraph processes with patches are treated in Section 4 and Section 5 respectively. Theorem 1 is proved in Section 5. Patch-free random hypergraphs and hypergraph processes are treated in Section 6 and Section 7 respectively. Theorem 2 is proved in Section 7. Finally, we mention future directions in Section 8.

2. Hypergraph definitions 

Recall from the Introduction that the identifiable vertices are those vertices re- 
moved by the hypergraph collapse algorithm described there, and the identifiable 
hyperedges are those hyperedges consisting only of identifiable vertices. 

Given a hypergraph $\Lambda$ and a subset $S \subseteq V$, $\Lambda^S$ denotes the hypergraph after all vertices in $S$ are deleted. More precisely,

(16)    $\Lambda^S(A) := \sum_{B \supseteq A,\; B \setminus S = A} \Lambda(B), \quad A \subseteq V \setminus S.$

We now more exactly specify the hypergraph collapse algorithm: select if possible a vertex $v$ with $\Lambda(\{v\}) \ge 1$; replace $V$ by $V \setminus \{v\}$ and $\Lambda$ by $\Lambda^{\{v\}}$; then repeat. When the algorithm terminates, we obtain a set $V^*$ consisting of the identifiable vertices, and a patch-free hypergraph $\Lambda^{V^*}$ on $V \setminus V^*$.

Suppose $\Lambda$ is a patch-free hypergraph, and thus has no identifiable vertices. Given such a hypergraph $\Lambda$ and a distinguished vertex $v_0$, we say that $v$ is in the domain of $v_0$ in $\Lambda$ if $v$ is identifiable in the hypergraph $\Lambda + 1_{\{v_0\}}$ obtained by augmenting $\Lambda$ by the hyperedge $\{v_0\}$. A hyperedge is said to be identifiable from $v_0$ if it is identifiable in $\Lambda + 1_{\{v_0\}}$.

Warning: For a general patch-free hypergraph, it is possible for vertex $u$ to be in the domain of $v$, while $v$ is not in the domain of $u$, although this cannot happen in graphs; see Figure 2.

Figure 2. Adding a patch on $v$ makes $u$ identifiable, but not vice versa.
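The asymmetry in the warning can be reproduced on a small example. The hypergraph below, with hyperedges {v, a}, {v, b} and {a, b, u}, is a hypothetical instance chosen for this sketch and is not necessarily the one drawn in Figure 2; the helper names are likewise ours.

```python
def identifiable_vertices(hyperedges):
    """Vertices removed by hypergraph collapse (naive repeated scan for patches)."""
    remaining = [set(e) for e in hyperedges]
    found = set()
    progress = True
    while progress:
        progress = False
        for e in remaining:
            if len(e) == 1:               # e is currently a patch
                (v,) = e
                found.add(v)
                for f in remaining:       # delete v from every hyperedge
                    f.discard(v)
                progress = True
    return found

def domain(hyperedges, v0):
    """Domain of v0: identifiable vertices after adding the patch {v0}."""
    return identifiable_vertices(list(hyperedges) + [{v0}])

H = [{"v", "a"}, {"v", "b"}, {"a", "b", "u"}]
print(domain(H, "v"))   # contains u
print(domain(H, "u"))   # does not contain v
```

A patch on v removes v, which turns {v, a} and {v, b} into patches on a and b; removing those leaves a patch on u. A patch on u only shrinks {a, b, u} to {a, b}, which is not a patch, so the collapse stops.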

3. Poisson Hypergraph Processes: Markov Properties

For $\{\beta_k\}_{k=1}^{\infty}$ a sequence of non-negative numbers, a Poisson($\beta$) random hypergraph is a random hypergraph $\Lambda$ with vertex set $V := \{1, \ldots, N\}$ so that for $A \subseteq V$,

(i) the random variable $\Lambda(A)$ has a Poisson distribution with mean $N \beta_{|A|} / \binom{N}{|A|}$, and

(ii) $\{\Lambda(A) : A \subseteq V\}$ is a collection of independent random variables.

In what follows, $\{p_k\}_{k=1}^{\infty}$ will be a probability distribution on the positive integers which has finite mean and

(17)    $p_1 + p_2 > 0.$

We now give an explicit construction of the hypergraph-valued stochastic process described in the introduction. Let $K_1, K_2, \ldots$ be a sequence of independent random variables in $\{1, 2, 3, \ldots\}$ with common distribution $P(K_n = k) = p_k$, for all $n, k \in \mathbb{N}$. Denote by $A_1, A_2, \ldots$ a sequence of independent random subsets of $V$, such that $A_n$ is chosen uniformly at random from the subsets of $V$ of size $K_n$ whenever $K_n \le N$; the set $A_n$ is not defined when $K_n > N$. Let $\{E_t\}_{t \ge 0}$ be a Poisson process, run at rate $N$, having arrival times $\tau_1, \tau_2, \ldots$. Define a stochastic process $\{\Lambda_t\}_{t \ge 0}$ with values in the set of hypergraphs with vertex set $V$ by

$\Lambda_t(A) := \sum_{n : \tau_n \le t} 1_{\{A_n = A\}}.$

Interpret $\Lambda_t(A)$ as the number of occurrences of hyperedge $A$ by time $t$. In summary, for each $A \subseteq V$,

(18)    $\{\Lambda_t(A)\}_{t \ge 0}$ is a Poisson process of rate $N p_{|A|} / \binom{N}{|A|}$,

and all these Poisson processes are independent. We call $\{\Lambda_t\}_{t \ge 0}$ a Poisson($p$) hypergraph process, where $p$ denotes the generating function

(19)    $p(x) := \sum_{k \ge 1} p_k x^k.$

The finite mean assumption is equivalent to $p'(1) < \infty$. For fixed $t > 0$, $\Lambda_t$ is a Poisson($tp$) random hypergraph.
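The explicit construction translates directly into a simulation. The sketch below uses only the Python standard library; the function name `sample_process`, the dictionary encoding of $\{p_k\}$, and the choice to return a list of (arrival time, hyperedge) pairs are assumptions of this example.

```python
import random

def sample_process(N, p, t_max, rng):
    """Sample the arrivals of a Poisson(p) hypergraph process on {1, ..., N} up to time t_max.

    p maps each size k to p_k (weights assumed to sum to 1).
    Returns a list of (arrival_time, hyperedge) pairs in increasing time order.
    """
    sizes, weights = zip(*sorted(p.items()))
    events, t = [], 0.0
    while True:
        t += rng.expovariate(N)               # gaps of a rate-N Poisson process
        if t > t_max:
            return events
        k = rng.choices(sizes, weights)[0]    # K_n with P(K_n = k) = p_k
        if k <= N:                            # A_n is not defined when K_n > N
            events.append((t, frozenset(rng.sample(range(1, N + 1), k))))

events = sample_process(50, {1: 0.2, 2: 0.5, 3: 0.3}, 2.0, random.Random(0))
```

The hypergraph $\Lambda_t$ is recovered by counting, for each subset $A$, the pairs with arrival time at most $t$ and hyperedge equal to $A$.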

Whereas the hypergraph literature has tended to concentrate on the "$k$-uniform" case (i.e. $p_k = 1$ for some $k$), we find the superposition of $k$-uniform random hypergraphs for various different values of $k$ can be handled without special effort, and leads to asymptotic properties absent from the $k$-uniform case. Moreover the Poisson structure simplifies our arguments, for example by allowing some summary statistics of $\{\Lambda_t\}_{t \ge 0}$ to be Markov processes in their own right: see Proposition 3.1. Poissonization is, of course, a well-established procedure; see Aldous (1989).



Previous literature has also concentrated on the case $\Lambda \le 1$. We now sketch a way to deduce from our results for a Poisson($\beta$) random hypergraph $\Lambda$ some corresponding results for $\Lambda \wedge 1$. We note moreover that if $p_k = 1$ for some $k$ then $\Lambda \wedge 1$ is exactly a $k$-uniform hypergraph. The set of identifiable vertices is the same for $\Lambda$ and $\Lambda \wedge 1$, but $\Lambda$ may have additional identifiable hyperedges. First consider patches. Throwing a Poisson($N \beta_1$) number of balls (i.e. patches) uniformly at random into $N$ urns yields a Binomial($N$, $1 - e^{-\beta_1}$) number of occupied urns (i.e. vertices covered by at least one patch). Hence the number of patches in $\Lambda$, less the number in $\Lambda \wedge 1$, divided by $N$, has limit in probability $\beta_1 + e^{-\beta_1} - 1$. On the other hand, the expected number of subsets of size at least 2 receiving at least 2 hyperedges is bounded uniformly in $N$. Hence, after rescaling by $N^{-1}$, only the extra patches in $\Lambda$ can contribute in the limit, and of course all of these do so.
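The balls-in-urns computation is easy to check by simulation. In the sketch below the Poisson($N\beta_1$) number of balls is generated as the number of arrivals of a rate-$N$ Poisson process in $[0, \beta_1]$; the function name and this sampling device are choices made for the example.

```python
import math
import random

def extra_patch_fraction(N, beta1, rng):
    """Simulate (number of patches in Lambda - number in Lambda ∧ 1) / N."""
    balls, occupied = 0, set()
    t = rng.expovariate(N)
    while t <= beta1:                      # Poisson(N * beta1) arrivals in [0, beta1]
        balls += 1
        occupied.add(rng.randrange(N))     # each patch lands on a uniform vertex
        t += rng.expovariate(N)
    return (balls - len(occupied)) / N

frac = extra_patch_fraction(20_000, 1.5, random.Random(1))
limit = 1.5 + math.exp(-1.5) - 1           # beta_1 + exp(-beta_1) - 1
```

For $N = 20000$ and $\beta_1 = 1.5$ the simulated fraction should be close to the limit, which is about 0.723.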

Proposition 3.1. Let $T_t$ and $Z_t$ denote the numbers of identifiable vertices and identifiable hyperedges for $\Lambda_t$. Both $\{T_t\}_{t \ge 0}$ and $\{(T_t, Z_t)\}_{t \ge 0}$ are Markov processes. The number of non-identifiable hyperedges in $\Lambda_t$, given that $T_t = m$, is conditionally Poisson, with mean

(20)    $N t \sum_{k \ge 1} p_k \left( 1 - \dfrac{\binom{m}{k} + (N - m)\binom{m}{k-1}}{\binom{N}{k}} \right).$

When $m - N\gamma = o(N)$, for $\gamma \in [0, 1]$, this reduces as $N \to \infty$ to

(21)    $N t\,[1 - p(\gamma) - (1 - \gamma)\,p'(\gamma)] + o(N).$
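A quick numerical comparison of (20) with the leading term of (21) can be done with exact binomial coefficients. The function names and the dictionary encoding of $\{p_k\}$ are assumptions of this sketch.

```python
from math import comb

def mean_exact(N, t, m, p):
    """Formula (20): N t sum_k p_k (1 - [C(m,k) + (N-m) C(m,k-1)] / C(N,k))."""
    return N * t * sum(
        pk * (1 - (comb(m, k) + (N - m) * comb(m, k - 1)) / comb(N, k))
        for k, pk in p.items()
    )

def mean_limit(N, t, gamma, p):
    """Leading term of (21): N t [1 - p(gamma) - (1 - gamma) p'(gamma)]."""
    value = sum(pk * gamma ** k for k, pk in p.items())
    deriv = sum(k * pk * gamma ** (k - 1) for k, pk in p.items())
    return N * t * (1 - value - (1 - gamma) * deriv)

p = {1: 0.2, 2: 0.5, 3: 0.3}
print(mean_exact(10**5, 1.0, 40_000, p), mean_limit(10**5, 1.0, 0.4, p))
```

The $k = 1$ term of (20) vanishes identically, since $\binom{m}{1} + (N - m)\binom{m}{0} = N$: patches are never non-identifiable.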
Remark 3.1. Because the total number of hyperedges in $\Lambda_t$ is Poisson($Nt$), Proposition 3.1 reduces the study of limits of identifiable hyperedges to the study of limits of identifiable vertices. In particular, if $N^{-1} T_t$ converges in distribution as $N \to \infty$ to a random variable $\bar T_t$, then necessarily

(22)    $N^{-1} Z_t \xrightarrow{d} t\,\{p(\bar T_t) + (1 - \bar T_t)\,p'(\bar T_t)\}.$

Remark 3.2. It is easy to identify the generator of $\{T_t\}_{t \ge 0}$, rescale by division by $N$, and take a limit on any compact interval $I \subset \mathbb{R}_+ \setminus \mathcal{S}$ (see (5)); however this approach did not lead to a proof of Theorem 1 because of the difficulty of passing through discontinuous phase transitions.

To prepare for the proof, some measure-theoretic apparatus is needed. Let $(\Omega, \mathcal{F}, P)$ be the probability space on which the process $\{\Lambda_t\}_{t \ge 0}$ is defined. For any set $S \subseteq V$, and any $t \ge 0$, define the $\sigma$-field $\mathcal{F}_t^S$ as

$\mathcal{F}_t^S := \bigvee_{0 \le s \le t} \sigma\{\Lambda_s(A) : |A \setminus S| \le 1\}.$

Let $V_t^*$ denote the set of vertices identifiable at time $t$. By construction, the event $\{V_t^* = S\}$ occurs if and only if, among all sets containing all vertices covered by patches, $S$ is the minimal subset of $V$ for which $\Lambda_t(A) = 0$ whenever $|A \setminus S| = 1$. Thus $\{V_t^* = S\} \in \mathcal{F}_t^S$.

When we consider $V_t^*$ as a "stopping set" for a set-indexed process, it becomes natural to define another $\sigma$-field:

$\mathcal{F}_{V_t^*} := \{B \in \mathcal{F} : B \cap \{V_t^* = S\} \in \mathcal{F}_t^S \text{ for all } S \subseteq V\}.$

$T_s$ and $Z_s$ are $\mathcal{F}_{V_t^*}$-measurable, for all $0 \le s \le t$. We may describe $\mathcal{F}_{V_t^*}$ informally as the knowledge we have about $\{\Lambda_s\}_{0 \le s \le t}$ after performing hypergraph collapse at each time $s \in [0, t]$.

Lemma 3.2.

(i) Fix any $t > 0$. Pick any collection of non-negative integers $\{k_A : A \subseteq V\}$, and set

$p(S) := P\Big( \bigcap_{A : |A \setminus S| > 1} \{\Lambda_t(A) = k_A\} \Big).$

Then

$P\Big( \bigcap_{A : |A \setminus V_t^*| > 1} \{\Lambda_t(A) = k_A\} \,\Big|\, \mathcal{F}_{V_t^*} \Big) = p(V_t^*).$

(ii) Fix any $t > 0$. The conditional distribution of the random hypergraph $\Lambda_t^S$ (in the notation of (16)), given $\mathcal{F}_{V_t^*}$, on the event $\{V_t^* = S\}$, where $|S| = m$, is that of a Poisson($\beta$) random hypergraph on $N - m$ vertices with parameters $\beta_1 = 0$ and

(23)    $\beta_j := \dfrac{t N}{N - m} \binom{N - m}{j} \sum_{i \ge 0} p_{i+j} \binom{m}{i} \Big/ \binom{N}{i+j}, \quad j \ge 2.$

For a random variable $X$, we write $X \sim$ Poisson($\mu$) to indicate that the distribution of $X$ is Poisson with expectation $\mu$. Also we will write $X \sim$ Binomial($n$, $p$) to indicate that $X$ is a Binomial random variable with parameters $n$ and $p$.

Proof of (i). Certainly $p(V_t^*)$ is $\mathcal{F}_{V_t^*}$-measurable. It remains to show that, for any $B \in \mathcal{F}_{V_t^*}$,

$\int_B p(V_t^*)\, dP = P\Big( B \cap \bigcap_{A : |A \setminus V_t^*| > 1} \{\Lambda_t(A) = k_A\} \Big).$

Split the event on the right into disjoint events by intersecting with $\{V_t^* = S\}$ for each $S \subseteq V$. For each $S$, $B \cap \{V_t^* = S\}$ lies in $\mathcal{F}_t^S$, and therefore is independent of $\{\Lambda_t(A) = k_A\}$ for every $A$ such that $|A \setminus S| > 1$, by construction of a Poisson hypergraph process. The right side becomes

$\sum_{S \subseteq V} p(S)\, P(B \cap \{V_t^* = S\}),$

which is equal to the left side; (i) follows. ■

Proof of (ii). Suppose $S \subseteq V$ and $A \subseteq V \setminus S$ with $|A| = j \ge 2$. For any $C \subseteq S$ with $|C| = i$, (18) implies that

$\Lambda_t(A \cup C) \sim \text{Poisson}\Big( t\, p_{j+i}\, N \Big/ \binom{N}{j+i} \Big).$

The result of part (i) implies that the random variables $\Lambda_t(A \cup C)$ are conditionally independent for different choices of $C$, given $\{V_t^* = S\} \cap \mathcal{F}_{V_t^*}$.

If $|S| = m$, there are $\binom{m}{i}$ choices of $C$ with $|C| = i$, and following the notation of (16),

$\Lambda_t^S(A) = \sum_{C \subseteq S} \Lambda_t(A \cup C) \sim \text{Poisson}\Big( t N \sum_{i \ge 0} \binom{m}{i}\, p_{j+i} \Big/ \binom{N}{j+i} \Big).$

In a Poisson($\beta$) random hypergraph on $(N - m)$ vertices, the number of occurrences of $A$, where $|A| = j$, is Poisson with parameter

$(N - m)\, \beta_j \Big/ \binom{N - m}{j}.$

On comparison with the previous line, this verifies the formula for $\beta_j$, when $j \ge 2$. Clearly there are no 1-hyperedges in $\Lambda_t^S$ when $\{V_t^* = S\}$, by definition of identifiability. Hence (ii) is established. ■

Proof of Proposition 3.1. Fix any $t > 0$. Suppose that $T_t = m$. The first jump in the process $\{(T_s, Z_s)\}_{s \ge t}$ can occur only when a new hyperedge arrives, and the arrival time is independent of the past. The law of the jump depends only on two things: the set $A$ of vertices in the new hyperedge (which is independent of the past), and the hypergraph $\Lambda_t^S$, where $S := V_t^*$. Lemma 3.2(ii) establishes that the law of $\Lambda_t^S$, conditional on $\mathcal{F}_{V_t^*}$, is fully determined by $m$, $t$, and the parameters $\{p_i\}_{i \ge 1}$; in particular it is conditionally independent of $\{(T_s, Z_s)\}_{0 \le s \le t}$ given that $\{T_t = m\}$. Hence the Markovian property of $\{T_t\}_{t \ge 0}$ and $\{(T_t, Z_t)\}_{t \ge 0}$ is established.

It follows from Lemma 3.2 that the total number of non-identifiable hyperedges in $\Lambda_t$, given that $\{T_t = m\}$, is conditionally Poisson, with mean $(N - m) \sum_j \beta_j$, for $\beta_j$ as in (23). Write $k := i + j$, and switch the order of summation, to obtain

$(N - m) \sum_{j \ge 2} \beta_j = N t \sum_{k \ge 1} p_k \sum_{j=2}^{k} \binom{N - m}{j} \binom{m}{k - j} \Big/ \binom{N}{k}.$

On considering the Hypergeometric($N$, $N - m$, $k$) distribution, we see that the inner sum is

$1 - \dfrac{\binom{m}{k} + (N - m)\binom{m}{k-1}}{\binom{N}{k}}.$

The last expression is zero when $k = 1$, so $(N - m) \sum_j \beta_j$ takes the form (20). When $m - N\gamma = o(N)$, the last expression converges, as $N \to \infty$, to $1 - \gamma^k - k \gamma^{k-1}(1 - \gamma)$, and is bounded between 0 and 1. The Bounded Convergence Theorem yields (21). ■

4. Identifiability In Random Hypergraphs With Patches

In this section we review some material from Darling and Norris (2004).

Fix $t > 0$, and set $\Lambda := \Lambda_t$, $\beta_k := t p_k$. In this case, $\Lambda$ is a Poisson($\beta$) random hypergraph. Suppose we perform hypergraph collapse, described above, in the following special way: at each step the next vertex $v$ to be deleted is selected with a probability proportional to the number of patches on $v$. This is called randomized collapse. The debris of a hypergraph is the number of hyperedges equal to the empty set. Set $\Lambda_0 := \Lambda$, and let $\{\Lambda_n\}_{n \in \mathbb{N}}$ denote the sequence of hypergraphs obtained. Set $Y_n$ and $Z_n$ to be the amount of patches and debris, respectively, in $\Lambda_n$; formally

$Y_n := \sum_{v} \Lambda_n(\{v\}), \quad \text{and} \quad Z_n := \Lambda_n(\emptyset).$

The key observation in Darling and Norris (2004) is that $\{(Y_n, Z_n)\}_{n \in \mathbb{N}}$ is a Markov chain (but not the same one as in Proposition 3.1, for here $t$ is fixed!), which stops at

(24)    $T := \inf\{n : Y_n = 0\}.$

Moreover, conditional on $\{Y_n = m, Z_n = k\}$,

(25)    $Z_{n+1} = k + 1 + W_{n+1}, \qquad Y_{n+1} = m - 1 - W_{n+1} + U_{n+1}.$

Here $W_{n+1}$ and $U_{n+1}$ are independent, with

(26)    $W_{n+1} \sim \text{Binomial}\Big(m - 1, \dfrac{1}{N - n}\Big), \qquad U_{n+1} \sim \text{Poisson}\big((N - n - 1)\, t\, \lambda_2(N, n)\big),$

where

(27)    $\lambda_2(N, n) := N \sum_{i \ge 0} p_{i+2} \binom{n}{i} \Big/ \binom{N}{i+2}.$
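The transition rule (25)-(27) can be simulated directly. The sketch below is a standard-library illustration, not the construction used in Darling and Norris (2004). It assumes the chain starts from $Y_0 \sim$ Poisson($N t p_1$) patches and $Z_0 = 0$ debris (our reading of the Poisson($tp$) initial hypergraph), and it samples Poisson variables by counting unit-rate arrivals, which is adequate only for moderate means.

```python
import random
from math import comb

def lambda2(N, n, p):
    """Formula (27): N * sum_{i>=0} p_{i+2} C(n, i) / C(N, i+2), written with k = i + 2."""
    return N * sum(pk * comb(n, k - 2) / comb(N, k) for k, pk in p.items() if 2 <= k <= N)

def poisson(mu, rng):
    """Poisson(mu) sample: number of unit-rate arrivals in [0, mu]."""
    count, s = 0, rng.expovariate(1.0)
    while s <= mu:
        count += 1
        s += rng.expovariate(1.0)
    return count

def run_chain(N, t, p, rng):
    """Iterate (25)-(26) until Y_n = 0; return (T, Z_T)."""
    y = poisson(N * t * p.get(1, 0.0), rng)     # assumed initial number of patches
    z, n = 0, 0
    while y > 0 and n < N:
        w = sum(rng.random() < 1.0 / (N - n) for _ in range(y - 1))   # Binomial(y-1, 1/(N-n))
        u = poisson((N - n - 1) * t * lambda2(N, n, p), rng)
        y, z, n = y - 1 - w + u, z + 1 + w, n + 1
    return n, z

T, Z = run_chain(2000, 1.0, {1: 1.0}, random.Random(0))
```

In the patches-only case `p = {1: 1.0}` the identifiable vertices are exactly the covered vertices, so $T/N$ should be near $1 - e^{-t}$ and $Z$ equals the initial number of patches.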

By construction, $T = |V^*|$, the number of identifiable vertices, and $Z := Z_T = \Lambda_T(\emptyset)$ is the number of identifiable hyperedges. For comparison, note that, by Proposition 3.1, the number of non-identifiable hyperedges in $\Lambda$, given that $T = N\gamma$, is conditionally Poisson, with mean

(28)    $N\,(t - \beta(\gamma) - (1 - \gamma)\,\beta'(\gamma)) + o(N).$

We obtained a limit theorem for $\bar T^N := N^{-1} T$ and $\bar Z^N := N^{-1} Z$, where $Z$ is the number of identifiable hyperedges. We state the result in a simple case. Set

$\beta(x) := \sum_k \beta_k x^k, \quad x \in [0, 1].$

Assume that $\beta_1 > 0$ and that the derivative $\beta'(1) < \infty$. Then

(29)    $\{x \in [0, 1) : \beta'(x) + \log(1 - x) < 0\}$

is non-empty, and its infimum is $g(t)$, as defined in (3). By our assumption (6) there is at most one $x \in [0, g(t))$ such that $\beta'(x) + \log(1 - x) = 0$, namely $g(t-)$; this is different from $g(t)$ only if $t \in \mathcal{S}$, the set of discontinuity points of the lower envelope $g$.

Let $\tilde T$ be a random variable taking values $g(t)$ and $g(t-)$, each with probability 1/2. As a special case of Darling and Norris (2004, Theorem 2.2) we know:

Theorem 4.1. The following limit in distribution holds as $N \to \infty$:

(30)    $(\bar T^N, \bar Z^N) \xrightarrow{d} (\tilde T,\; \beta(\tilde T) - (1 - \tilde T) \log(1 - \tilde T)).$



Remark 4.1. Goldschmidt and Norris (2002) have shown that the limit for the rescaled number of identifiable hyperedges can be decomposed as follows: $-(1 - \tilde T) \log(1 - \tilde T)$ counts the essential hyperedges, i.e. those whose absence would have reduced the set of identifiable vertices, and $\beta(\tilde T)$ counts the remainder.

Remark 4.2. Suppose in particular that $\Lambda := \Lambda_t$ and $\beta(x) := t p(x)$ for some $t \in \mathcal{S}$, the discontinuity set of $g$. Then (30) implies that the proportion of identifiable vertices has a limit in distribution which is random, taking the values $g(t)$ and $g(t-)$ each with probability 1/2.

Remark 4.3. It suffices to derive the limit for $\bar T^N$, since the limit for $\bar Z^N$ follows from Proposition 3.1. To check this, recall that, by (22), if $\bar T^N$ converges to $g(t)$, then the number of identifiable hyperedges, divided by $N$, converges to

(31)    $t\,\{p(g(t)) + [1 - g(t)]\,p'(g(t))\}.$

However by definition of $g(t)$, $t\,p'(g(t)) = -\log(1 - g(t))$, so we have recovered the formula $\beta(\tilde T) - (1 - \tilde T) \log(1 - \tilde T)$.

5. Identifiability In Hypergraph Processes With Patches

In this section we move from the static random hypergraph model of Theorem 4.1 to the Poisson($p$) hypergraph process $\{\Lambda_t\}_{t \ge 0}$, providing here a proof of Theorem 1.

Extending the notation of the previous section, let $\bar T_t^N$ and $\bar Z_t^N$ denote the rescaled numbers of identifiable vertices and hyperedges for $\Lambda_t$, respectively, as defined in (1). Note that $t \mapsto \bar T_t^N$ and $t \mapsto \bar Z_t^N$ are increasing, right-continuous, stochastic processes. It follows from Proposition 3.1 that $\{(\bar T_t^N, \bar Z_t^N)\}_{t \ge 0}$ is a Markov process.

Proof of Theorem 1. Fix $0 \le t_1 < \ldots < t_r$. We have to show the convergence in distribution

(32)    $\{(\bar T_{t_i}^N, \bar Z_{t_i}^N)\}_{i=1,\ldots,r} \xrightarrow{d} \{(T_{t_i}, Z_{t_i})\}_{i=1,\ldots,r}.$

It suffices to do so when at least one of $\{t_i, t_{i+1}\}$ is not a discontinuity point, for every $i \in \{1, \ldots, r - 1\}$. Proposition 3.1 showed that $\{(\bar T_t^N, \bar Z_t^N)\}_{t \ge 0}$ is Markov, and for any Markov process $\{Y_t\}_{t \ge 0}$ the conditional law of $Y_{t_r}$ given $(Y_{t_1}, \ldots, Y_{t_{r-1}})$ is the same as the conditional law given $Y_{t_{r-1}}$. Hence it suffices to consider the case $r = 2$ such that $t_1 \notin \mathcal{S}$ or $t_2 \notin \mathcal{S}$, and these possibilities are both subsumed in the case $r = 3$ with $t_1, t_3 \notin \mathcal{S}$. Then only the marginal limit at time $t_2$, as given in Theorem 4.1, is random, so Theorem 4.1 implies the full convergence in distribution.

The second assertion follows from the first since all processes are increasing, and the limit is deterministic and continuous on $I$. ■

Remark 5.1. The rescaled number of essential hyperedges, as studied by Goldschmidt and Norris (2002), has a limit $\{-(1 - T_t) \log(1 - T_t)\}_{t \ge 0}$ in the same sense as (9) and (10).

Remark 5.2. One may ask whether the convergence (9) extends to weak convergence in the Skorohod space $D([0, \infty), \mathbb{R}^2)$. Since $t \mapsto \bar T_t^N$ and $t \mapsto \bar Z_t^N$ are non-decreasing, the necessary and sufficient condition of Jacod and Shiryaev (1987, p. 306) may be applied, which would require that the sum of squared jumps of $\{\bar T_t^N\}$ converges in law to the sum of squared jumps of $\{T_t\}$, and similarly for $\{\bar Z_t^N\}$. Unfortunately the techniques presented in this paper do not seem to be able to confirm this; indeed, it seems plausible that, for arbitrarily large $N$, and for $t \in \mathcal{S}$, there is a probability bounded away from zero that $\bar T_s^N$ makes more than one jump in going from $\approx g(t-)$ to $\approx g(t)$ at time $s \approx t$, and this would contradict the condition stated.

Remark 5.3. If (6) is false, one can reformulate the process (7), by consulting Darling and Norris (2004, Theorem 2.2), and prove a corresponding version of Theorem 1.

6. Domain Of A Vertex In A Hypergraph Without Patches

We revert to the fixed-time setting of Section 4. Suppose $\Lambda$ is a Poisson($\beta$) random hypergraph, such that

$\beta_0 = \beta_1 = 0 < \beta_2, \qquad \beta(x) := \sum_{k \ge 2} \beta_k x^k, \quad x \in [0, 1].$

Fix a vertex $v_0$. Write $T^N$ for the number of vertices in the domain of $v_0$, and write $Z^N$ for the number of hyperedges identifiable from $v_0$. Set $\bar T^N := N^{-1} T^N$ and $\bar Z^N := N^{-1} Z^N$. Both the microscopic variables $(T^N, Z^N)$, and the macroscopic variables $(\bar T^N, \bar Z^N)$ have non-trivial limits as $N \to \infty$, which we now describe. The coefficient $\beta_2$ plays a distinguished role.

Lemma 6.1. Let $\{\zeta_n\}_{n \in \mathbb{N}}$ be a random walk on the integers, started at $\zeta_0 = 1$, whose increments are of the form $\zeta_n - \zeta_{n-1} = -1 + \text{Poisson}(2\beta_2)$. Let $\varphi$ be the largest root in $[0, 1]$ of $2\beta_2 x + \log(1 - x) = 0$, so $\varphi = 0$ for $2\beta_2 \le 1$, and $0 < \varphi < 1$ otherwise. Then the first passage time to 0,

(33)    $M := \inf\{n \ge 0 : \zeta_n = 0\},$

has the following distribution:

(34)    $P(M = n) = e^{-2\beta_2 n} (2\beta_2 n)^{n-1} / n!, \quad n \in \mathbb{N}; \qquad P(M = \infty) = \varphi.$

Remark. $M$ is distributed as the total number of individuals in a branching process with one ancestor, and Poisson($2\beta_2$) offspring distribution. This distribution describes the sizes of small components in an Erdős-Rényi random graph; see Bollobás (2001).

Proof. The fact that $P(M = \infty) = \varphi$ is an elementary fact from the theory of branching processes. The formula for $P(M = n)$ is a special case of a formula of Dwass (1969), which is proved in detail on p. 300 of Devroye (1998). ■
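Lemma 6.1 can be illustrated by simulating the random walk. In the sketch below $\lambda$ stands for $2\beta_2$, and a walk that climbs to the level `escape` is declared never to return. This truncation is an assumption of the example: the walk moves down by at most one per step, so from height $h$ it returns to 0 with probability $(1 - \varphi)^h$, which is negligible in the supercritical case but makes the shortcut unsuitable for $\lambda$ close to 1.

```python
import math
import random

def first_passage(lam, rng, escape=50):
    """Steps until the walk zeta_0 = 1, zeta_n - zeta_{n-1} = -1 + Poisson(lam), hits 0.

    Returns math.inf once the walk reaches `escape` (treated as never returning).
    """
    zeta, n = 1, 0
    while zeta > 0:
        if zeta >= escape:
            return math.inf
        # Poisson(lam) sample: number of unit-rate arrivals in [0, lam]
        k, s = 0, rng.expovariate(1.0)
        while s <= lam:
            k, s = k + 1, s + rng.expovariate(1.0)
        zeta, n = zeta - 1 + k, n + 1
    return n

rng = random.Random(2)
samples = [first_passage(2.0, rng) for _ in range(5000)]
frac_infinite = sum(m == math.inf for m in samples) / len(samples)
frac_one = sum(m == 1 for m in samples) / len(samples)
```

For $\lambda = 2$ the lemma gives $P(M = \infty) = \varphi \approx 0.797$ and $P(M = 1) = e^{-2} \approx 0.135$, which the empirical fractions should approximate.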



Assume that $\beta'(1) < \infty$. Then the set (29) is non-empty, and its infimum is $g := g(t)$, as defined in (3). Assume further that $\beta'(x) + \log(1 - x) > 0$ for all $x \in (0, g)$. If either of these assumptions fails, then the techniques of Darling and Norris (2004), combined with some arguments given below, still establish the desired asymptotics. We omit the details.

Set

(35)    $\tilde T := g\, 1_{\{M = \infty\}}; \qquad \tilde Z := [\beta(g) - (1 - g) \log(1 - g)]\, 1_{\{M = \infty\}}.$

Theorem 6.2. Consider a Poisson($\beta$) random hypergraph without patches, and fix a distinguished vertex $v_0$. The number of vertices in the domain of $v_0$, and number of hyperedges identifiable from $v_0$, obey the following limits in distribution as $N \to \infty$:

(36)    $(T^N, Z^N) \xrightarrow{d} (M, M); \qquad (\bar T^N, \bar Z^N) \xrightarrow{d} (\tilde T, \tilde Z).$

Here $M$ is considered as a random variable taking values in the one-point compactification $\mathbb{N} \cup \{\infty\}$ of $\mathbb{N}$.

Proof. Step I. Set $\Lambda_0 := \Lambda + 1_{\{v_0\}}$, and let $\{\Lambda_n\}_{n \in \mathbb{N}}$ be a sequence of hypergraphs obtained by randomized collapse. Denote by $Y_n^N$ and $Z_n^N$ the numbers of patches and debris, respectively, in $\Lambda_n$. Then

$T^N = \inf\{n \ge 0 : Y_n^N = 0\}; \qquad Z^N = Z^N_{T^N}.$

We know that $\{(Y_n^N, Z_n^N)\}_{n \ge 0}$ is a Markov chain, starting from $(1, 0)$: the increments, conditional on $Y_n^N = m \ge 1$ and $Z_n^N = k$, are as given in (25) and (26).

For fixed $n \ge 0$ and $m \ge 1$, the random variable $W_{n+1}$ defined in (26) converges to 0 in distribution as $N \to \infty$. Also

(37)    $(N - n - 1)\, \lambda_2(N, n) \to 2 p_2,$

so the random variable $U_{n+1}$ defined in (26) converges to Poisson($2\beta_2$) in distribution as $N \to \infty$ (recall that $\beta_2 = t p_2$). Hence, for all $n \ge 0$,

$\{(Y_j^N, Z_j^N)\}_{0 \le j \le n} \xrightarrow{d} \{(\zeta_j, j)\}_{0 \le j \le n},$

which implies $(T^N, Z^N) \xrightarrow{d} (M, M)$ as $N \to \infty$. If $2\beta_2 \le 1$, then $P(M = \infty) = 0$, so the proof is complete. It only remains to prove the second convergence assertion in the case where $2\beta_2 > 1$, and $0 < \varphi < 1$.
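Step II. Introduce an auxiliary time variable $t$, and let $\{\nu_t\}_{t \ge 0}$ be a Poisson process
of rate $N$. Set

$\tilde\nu^N_t \stackrel{\mathrm{def}}{=} N^{-1}\nu_t$;  $\tilde Y^N_t \stackrel{\mathrm{def}}{=} N^{-1} Y^N_{\nu_t}$;  $\tilde Z^N_t \stackrel{\mathrm{def}}{=} N^{-1} Z^N_{\nu_t}$;

$\tau^N \stackrel{\mathrm{def}}{=} \inf\{t \ge 0 : \tilde Y^N_t = 0\}$.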

With reference to Darling and Norris (2004), set

$y(t) \stackrel{\mathrm{def}}{=} (1 - t)(\beta'(t) + \log(1 - t))$;
$z(t) \stackrel{\mathrm{def}}{=} \beta(t) - (1 - t)\log(1 - t)$.
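
To make the roles of $y$ and $z$ concrete, the following Python sketch evaluates them for a hypothetical patch-free coefficient sequence ($\beta_2 = 1$, $\beta_3 = 0.5$, chosen only for illustration) and locates the first zero of $y$ in $(0, 1)$ by a grid scan followed by bisection. Under the assumptions of this section that zero plays the role of $g$, and $z(g)$ is the corresponding limit for the scaled number of identifiable hyperedges. All names and numerical values are assumptions of this sketch.

```python
import math

# Hypothetical coefficients beta_k of beta(t) = sum_k beta_k t^k (patch-free: beta_1 = 0).
BETA = {2: 1.0, 3: 0.5}


def beta(t):
    return sum(b * t**k for k, b in BETA.items())


def beta_prime(t):
    return sum(k * b * t ** (k - 1) for k, b in BETA.items())


def y(t):
    """Fluid-limit patch density y(t) = (1 - t)(beta'(t) + log(1 - t))."""
    return (1.0 - t) * (beta_prime(t) + math.log1p(-t))


def z(t):
    """Fluid-limit debris density z(t) = beta(t) - (1 - t) log(1 - t)."""
    return beta(t) - (1.0 - t) * math.log1p(-t)


def first_zero_of_y(steps=10_000):
    """First t in (0, 1) with y(t) = 0; returns 0.0 if y is non-positive from the start."""
    lo = 1.0 / steps
    if y(lo) <= 0:
        return 0.0
    for i in range(2, steps):
        hi = i / steps
        if y(hi) <= 0:
            # Sign change bracketed in (lo, hi]: refine by bisection.
            for _ in range(60):
                mid = 0.5 * (lo + hi)
                if y(mid) > 0:
                    lo = mid
                else:
                    hi = mid
            return 0.5 * (lo + hi)
        lo = hi
    return None  # no zero found on the grid


g = first_zero_of_y()
print(f"g = {g:.6f}, z(g) = {z(g):.6f}")
```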

By Theorem 6.1 and Remark 6.2 of Darling and Norris (2004), for all $\delta > 0$,

(38)  $\limsup_{N \to \infty} \dfrac{1}{N} \log P\left( \sup_t \|(\tilde\nu^N_t, \tilde Y^N_t, \tilde Z^N_t) - (t, y(t), z(t))\| > \delta \right) < 0$.

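Observe that $\tilde\nu^N_{\tau^N} = \tilde T^N$, which will have the same limit in probability as does $\tau^N$.
We will show that, for all $\theta \in (\log(1 - \varphi), 0)$, there exist $\delta > 0$ and $N_0$ such that

(39)  $P(\tilde T^N \le \delta) \le e^\theta$ for all $N \ge N_0$.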

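By (34) and the fact that $T^N \Rightarrow M$, we know that, for all $\delta > 0$ and all $\varphi' > \varphi$:

$P(\tilde T^N \le \delta) \ge 1 - \varphi'$

for all sufficiently large $N$. Also from (38) we obtain, for all $\delta > 0$,

$P(\tilde T^N \in (\delta, g - \delta) \cup (g + \delta, \infty)) \to 0$

as $N \to \infty$. Hence the claim that $\tilde T^N \Rightarrow \tilde T$ will follow as soon as we have
proved (39); then (38) will strengthen this to show $(\tilde T^N, \tilde Z^N) \Rightarrow (\tilde T, \tilde Z)$.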

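Step III. The remainder of the proof is to establish (39). Given $Y^N_n = m \ge 1$, set

$\Phi^N_\theta(m, n) = E \exp\{\theta(-1 - W_{n+1} + U_{n+1})\}$

$= \exp\left\{-\theta + F\left(m - 1, \dfrac{1}{N - n}, -\theta\right) + G((N - n - 1)\lambda_2(N, n), \theta)\right\}$,

where

$F(k, p, \theta) = k \log(1 - p + p e^\theta)$;  $G(\mu, \theta) = \mu(e^\theta - 1)$.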
Lemma 6.1 of Darling and Norris (2004) implies that

$\sup_{n \le N/2} \left| (N - n - 1)\lambda_2(N, n) - \left(1 - \dfrac{n}{N}\right)\beta''(n/N) \right| \to 0$

as $N \to \infty$. Since $0 > \theta > \log(1 - \varphi)$, there is $\psi \in (0, \varphi)$ such that $\theta = \log(1 - \psi)$; by
construction of $\varphi$, $2\beta_2\psi + \log(1 - \psi) > 0$, so $2\beta_2(1 - e^\theta) + \theta > 0$; in other words,

$\exp\{-\theta + G(2\beta_2, \theta)\} < 1$.

We can therefore find $\delta > 0$ and $N_0$ such that

(40)  $\Phi^N_\theta(m, n) < 1$, for all $m, n \le N\delta$, for all $N \ge N_0$.
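Consider the martingale

$M_n \stackrel{\mathrm{def}}{=} \exp\{\theta Y^N_n\} \left( \prod_{k=0}^{n-1} \Phi^N_\theta(Y^N_k, k) \right)^{-1}$,

and set $R^N \stackrel{\mathrm{def}}{=} \inf\{n \ge 0 : Y^N_n > N\delta\}$. It follows from (40) that, on the event
$\{T^N \le R^N \wedge N\delta\}$,

$M_{T^N} \ge 1$, for all $N \ge N_0$.

Hence for $N \ge N_0$,

$e^\theta = E M_0 = E M_{T^N \wedge R^N \wedge N\delta} \ge P(T^N \le R^N \wedge N\delta)$.

However (38) implies that, for $\delta < g/2$, $P(R^N < T^N \le N\delta) \to 0$, and (39) follows.

■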

7. Identifiability In Patch-Free Processes 

We now focus on the case of patch-free hypergraph processes, proving in this section 
Theorem |21 

7.1. A Coupled Family of Random Walks. Let $\{P_t(n)\}_{t \ge 0}$, $n \in \mathbb{N}$, be a family
of independent Poisson processes, all of rate $2\rho_2 > 0$, and consider the coupled family
of random walks $\{\zeta_t(n)\}_{n \ge 0}$, for $t \in \mathbb{R}_+$, where $\zeta_t(0) = 1$ for all $t$, and

(41)  $\zeta_t(n + 1) = \zeta_t(n) + (P_t(n + 1) - 1)\,1_{\{n < M_t\}}$;

(42)  $M_t \stackrel{\mathrm{def}}{=} \inf\{n \ge 0 : \zeta_t(n) = 0\} \in \mathbb{N} \cup \{\infty\}$.
The marginal law of $M_t$ is given by (34) with $\beta_2 = t\rho_2$. There is a relation between
$\{\zeta_t(n)\}_{n \ge 0}$ and the multigraph structure function: since $g_2(t)$ is the largest root in
$[0, 1]$ of $2t\rho_2 x + \log(1 - x) = 0$, we have as a special case of (34):

Lemma 7.1. The first time $t$ at which $\{\zeta_t(n)\}_{n \ge 0}$ escapes to infinity is related to the
multigraph lower envelope $g_2$ as follows:

(43)  $P(M_t = \infty) = g_2(t)$.

Moreover $t \mapsto M_t$ is an increasing process by the coupling, so $\chi \stackrel{\mathrm{def}}{=} \inf\{t \ge 0 : M_t = \infty\}$
is a continuous random variable with distribution function $g_2(t)$.
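
The coupling in (41) and (42) can be illustrated by a short simulation. The Python sketch below builds one Poisson process of rate $2\rho_2$ per step index, shares these processes across all times $t$, and records the first passage time $M_t$ of each walk, truncated at a hypothetical cap `n_max` (a walk that has not reached 0 by then is reported as `None`, standing in for $M_t = \infty$). The function names, the cap and the parameter values are assumptions of this sketch.

```python
import random


def poisson_points(rate, horizon, rng):
    """Event times of a Poisson process of the given rate on [0, horizon]."""
    points, t = [], rng.expovariate(rate)
    while t <= horizon:
        points.append(t)
        t += rng.expovariate(rate)
    return points


def coupled_passage_times(times, rho2, n_max, rng):
    """Return M_t for each t in `times`; None means 'not absorbed within n_max steps'."""
    horizon = max(times)
    # One independent Poisson process per step index n = 1..n_max, shared by all t.
    clocks = [poisson_points(2.0 * rho2, horizon, rng) for _ in range(n_max)]
    result = []
    for t in times:
        zeta, hit = 1, None
        for n in range(1, n_max + 1):
            # P_t(n) is the number of events of the n-th process up to time t.
            zeta += sum(1 for s in clocks[n - 1] if s <= t) - 1
            if zeta == 0:
                hit = n
                break
        result.append(hit)
    return result


rng = random.Random(0)
print(coupled_passage_times([0.2, 0.5, 1.0, 2.0], rho2=0.5, n_max=200, rng=rng))
```

Because every walk uses the same underlying processes and each step decreases a walk by at most one, the reported passage times are non-decreasing in $t$ along every sample path, which is the monotonicity used in Lemma 7.1.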

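7.2. Notation. We finally turn to the case of a Poisson($\rho$) hypergraph process
$\{\Lambda_t\}_{t \ge 0}$ without patches, i.e. such that

$\rho_0 \stackrel{\mathrm{def}}{=} \rho_1 \stackrel{\mathrm{def}}{=} 0 < \rho_2$,  $\rho(x) \stackrel{\mathrm{def}}{=} \sum_{k \ge 2} \rho_k x^k$,  $x \in [0, 1]$.

Write $T^N_t$ for the number of vertices in the domain of $v_0$ in $\Lambda_t$, and write $Z^N_t$ for
the number of hyperedges identifiable from $v_0$ in $\Lambda_t$. Set $\tilde T^N_t \stackrel{\mathrm{def}}{=} N^{-1} T^N_t$ and $\tilde Z^N_t \stackrel{\mathrm{def}}{=} N^{-1} Z^N_t$.
Using (21), we define what will turn out to be the macroscopic limits for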
Theorem |21 

$\tilde T_t \stackrel{\mathrm{def}}{=} g(t)\,1_{\{M_t = \infty\}}$;

$\tilde Z_t \stackrel{\mathrm{def}}{=} \{t\rho(g(t)) - [1 - g(t)]\log(1 - g(t))\}\,1_{\{M_t = \infty\}}$.
Proof of Theorem\^ 

Step I. Extending the notation of Theorem 6.2, let $\Lambda_t(n)$ denote the hypergraph
that results from applying $n$ steps of randomized collapse to $\Lambda_t + 1_{\{v_0\}}$; $Y^N_t(n)$ and
$Z^N_t(n)$ count the number of patches, and the amount of debris, respectively in $\Lambda_t(n)$,
and $n$ is assumed to satisfy:

$n \le T^N_t = \inf\{n \ge 0 : Y^N_t(n) = 0\}$.

Consider a finite set of time points $0 \le t_1 < \dots < t_r$. The hypergraph collapses of
$\Lambda_{t_1} + 1_{\{v_0\}}, \dots, \Lambda_{t_r} + 1_{\{v_0\}}$ are coupled together as follows: perform the $(n + 1)$st step
of randomized collapse by choosing a patch uniformly at random from the smallest
unstable hypergraph. Poisson symmetries imply that this amounts to randomized
collapse for each of the unstable hypergraphs. Condition on the event:

(44)  $\bigcap_{i=1}^{r} \{Y^N_{t_i}(n) = m_i,\ Z^N_{t_i}(n) = k_i\}$.

For $i$ such that $m_i = 0$, evidently $Y^N_{t_i}(n + 1) = 0$ and $Z^N_{t_i}(n + 1) = k_i$. For those $i$
such that $m_i \ge 1$, we may write:

$Y^N_{t_i}(n + 1) = m_i - 1 - W^N_{t_i}(n + 1) + U^N_{t_i}(n + 1)$;
$Z^N_{t_i}(n + 1) = k_i + 1 + W^N_{t_i}(n + 1)$,

where the random increments are distributed as follows. Take $q$ to be the least
$i \in \{1, 2, \dots, r\}$ for which $m_i \ge 1$, and take $W^N_{t_q}(n + 1)$ and $U^N_{t_q}(n + 1)$ independent
such that

(45)  $W^N_{t_q}(n + 1) \sim \mathrm{Binomial}\left(m_q - 1, \dfrac{1}{N - n}\right)$;  $U^N_{t_q}(n + 1) \sim \mathrm{Poisson}\left((N - n - 1)\,t_q\,\lambda_2(N, n)\right)$,

where $\lambda_2(N, n)$ is as in (27). Because of the coupling, we may take subsequent
increments (for $i = q, \dots, r - 1$) to be independent and of the form:

$W^N_{t_{i+1}}(n + 1) - W^N_{t_i}(n + 1) \sim \mathrm{Binomial}\left(m_{i+1} - m_i, \dfrac{1}{N - n}\right)$;

$U^N_{t_{i+1}}(n + 1) - U^N_{t_i}(n + 1) \sim \mathrm{Poisson}\left((N - n - 1)(t_{i+1} - t_i)\lambda_2(N, n)\right)$.

Step II. Observe that the behavior of $\lambda_2(N, n)$ depends on whether $n = O(1)$, or
$n = O(N)$. It follows from (37) and the calculations in Step I that, conditional on
(44), the joint law of

$\left((Y^N_{t_1}(n + 1), Z^N_{t_1}(n + 1)), \dots, (Y^N_{t_r}(n + 1), Z^N_{t_r}(n + 1))\right)$

converges as $N \to \infty$ to the conditional law of

$\left((\zeta_{t_1}(n + 1), k_1 + 1), \dots, (\zeta_{t_r}(n + 1), k_r + 1)\right)$

given that $\zeta_{t_1}(n) = m_1, \dots, \zeta_{t_r}(n) = m_r$. Evidently $Z^N_{t_i}(0) = 0$ for all $i$. Since $n$ was
arbitrary, and since for each $t$ both $\{\zeta_t(n)\}_{n \ge 0}$ and $\{(Y^N_t(n), Z^N_t(n))\}_{n \ge 0}$ are Markov,
we have now proved convergence in distribution as $N \to \infty$:

$\left\{(Y^N_{t_1}(n), Z^N_{t_1}(n)), \dots, (Y^N_{t_r}(n), Z^N_{t_r}(n))\right\}_{n \ge 0} \Rightarrow \left\{(\zeta_{t_1}(n), n \wedge M_{t_1}), \dots, (\zeta_{t_r}(n), n \wedge M_{t_r})\right\}_{n \ge 0}$.

In particular, in the notation of (42) and Section 7.2,

(46)  $\left((T^N_{t_1}, Z^N_{t_1}), \dots, (T^N_{t_r}, Z^N_{t_r})\right) \Rightarrow \left((M_{t_1}, M_{t_1}), \dots, (M_{t_r}, M_{t_r})\right)$.

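Step III. To prove (13) it suffices, in the light of (46), to prove tightness of
$\{(T^N_t, Z^N_t)\}_{t \ge 0}$ with respect to the Skorohod topology of $D([0, \infty), (\mathbb{N} \cup \{\infty\})^2)$. On
$(\mathbb{N} \cup \{\infty\})^2$, we shall use the metric

$d((m, n), (p, q)) \stackrel{\mathrm{def}}{=} \max\left\{ \left|\dfrac{1}{m} - \dfrac{1}{p}\right|, \left|\dfrac{1}{n} - \dfrac{1}{q}\right| \right\}$,

understanding that $1/\infty = 0$. We shall verify the condition of Aldous for tightness of
$\{(T^N_t, Z^N_t)\}_{t \ge 0}$, as stated in Billingsley (1999), p. 176, or Kallenberg (2002), p. 314,
with respect to this metric. Since $s \mapsto T^N_s$ and $s \mapsto Z^N_s$ are non-decreasing processes,
the condition takes a slightly simpler form than usual: it suffices to show that, for
each $\varepsilon > 0$ and $\eta > 0$, there exist $h$ and $N_0$ such that for every bounded sequence of
optional times $\sigma_N$ with respect to $\{(T^N_t, Z^N_t)\}_{t \ge 0}$, and for every $N \ge N_0$,

(47)  $P\left( \max\left\{ \dfrac{1}{T^N_\sigma} - \dfrac{1}{T^N_{\sigma + h}},\ \dfrac{1}{Z^N_\sigma} - \dfrac{1}{Z^N_{\sigma + h}} \right\} > \varepsilon \right) < \eta$,

where $\sigma$ is short for $\sigma_N$ in the subscripts.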

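Proposition 3.1 established that $\{(T^N_t, Z^N_t)\}_{t \ge 0}$ is a Markov process. By the strong
Markov property, the conditional law of $T^N_{\sigma + h} - T^N_\sigma$, given that $T^N_\sigma = m \stackrel{\mathrm{def}}{=} m_N$ and
$Z^N_\sigma = q_N$, is the same as that of the number of identifiable vertices in a Poisson($\beta$)
random hypergraph on $N' \stackrel{\mathrm{def}}{=} N - m$ vertices, where by the reasoning of an earlier lemma
and the fact that $\rho_1 = 0$,

$\beta_1 \stackrel{\mathrm{def}}{=} \dfrac{hN}{N - m} \sum_{k \ge 2} \rho_k \binom{m}{k - 1} \binom{N - m}{1} \Big/ \binom{N}{k}$.

Suppose $\varepsilon > 0$ and $\eta > 0$ are given. In the case where $\min\{m_N, q_N\} > 1/\varepsilon$, it follows
that

(48)  $\max\left\{ \dfrac{1}{T^N_\sigma} - \dfrac{1}{T^N_{\sigma + h}},\ \dfrac{1}{Z^N_\sigma} - \dfrac{1}{Z^N_{\sigma + h}} \right\} \le \max\left\{ \dfrac{1}{T^N_\sigma},\ \dfrac{1}{Z^N_\sigma} \right\} < \varepsilon$.

On the other hand, if $m_N \le 1/\varepsilon$, then

$\beta_1 N' = hN \sum_{k \ge 2} \rho_k \binom{m}{k - 1} (N - m) \Big/ \binom{N}{k} \to 2h\rho_2 m \le \dfrac{2h\rho_2}{\varepsilon}$.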
Choose $N_0$ so large that, for $N \ge N_0$, the right side is not more than $3h\rho_2/\varepsilon$; now it
is true that, for any

$h \le \dfrac{-\varepsilon \log(1 - \eta)}{3\rho_2}$

and for any $N \ge N_0$, the probability that this random hypergraph has no patches, and hence no identifiable
vertices nor identifiable hyperedges, is at least $1 - \eta$; in that case, $T^N_{\sigma + h} = T^N_\sigma$
and $Z^N_{\sigma + h} = Z^N_\sigma$. In summary, for such $N_0$ and $h$, (47) holds. Hence $\{(T^N_t, Z^N_t)\}_{t \ge 0}$ is
tight, and the claimed convergence follows.
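
The choice of $h$ in Step III can be checked numerically. The Python sketch below assumes that the total patch intensity on the remaining $N - m$ vertices has the form $hN \sum_{k \ge 2} \rho_k \binom{m}{k-1}(N - m)/\binom{N}{k}$ (this is our reading of the garbled display and should be treated as an assumption), evaluates it for hypothetical values of $\rho_k$, $m$ and $N$, and confirms that $h = -\varepsilon\log(1 - \eta)/(3\rho_2)$ makes the probability of seeing no patch at least $1 - \eta$. Function names and parameter values are illustrative only.

```python
import math
from math import comb


def patch_intensity(N, m, h, rho):
    """Assumed total patch rate: h*N * sum_k rho_k * C(m, k-1) * (N - m) / C(N, k)."""
    return h * N * sum(
        r * comb(m, k - 1) * (N - m) / comb(N, k) for k, r in rho.items()
    )


def h_bound(eps, eta, rho2):
    """Largest h allowed by the condition h <= -eps * log(1 - eta) / (3 * rho_2)."""
    return -eps * math.log(1.0 - eta) / (3.0 * rho2)


rho = {2: 0.6, 3: 0.4}   # hypothetical hyperedge-size distribution, rho_1 = 0
N, m = 10**6, 5          # m <= 1/eps with eps = 0.2
eps, eta = 0.2, 0.1

h = h_bound(eps, eta, rho[2])
rate = patch_intensity(N, m, h, rho)
p_no_patch = math.exp(-rate)  # Poisson probability of zero patches
print(f"h = {h:.6f}, intensity = {rate:.6f}, P(no patch) = {p_no_patch:.6f}")
```

For large $N$ only the $k = 2$ term survives, so the intensity is close to $2h\rho_2 m$, comfortably below the bound $3h\rho_2/\varepsilon$ used in the text.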

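Step IV. As for the remaining convergence assertion, we need only check the convergence of finite-dimensional
distributions, i.e. that

(49)  $\left((\tilde T^N_{t_1}, \tilde Z^N_{t_1}), \dots, (\tilde T^N_{t_r}, \tilde Z^N_{t_r})\right) \Rightarrow \left((\tilde T_{t_1}, \tilde Z_{t_1}), \dots, (\tilde T_{t_r}, \tilde Z_{t_r})\right)$

for every finite set of time points $0 \le t_1 < \dots < t_r$. For the case $r = 1$, the validity
of (49) follows from Theorem 6.2. For the sake of brevity, we restrict our discussion of
the case $r > 1$ to the $T$ component; the argument for the $Z$ component is similar. It
suffices to show, for all $q = 2, \dots, r$, and all $\varepsilon > 0$, that

(50)  $P\left( \bigcap_{1 \le j < q} \{\tilde T^N_{t_j} < \varepsilon\} \cap \bigcap_{q \le i \le r} \{|\tilde T^N_{t_i} - g(t_i)| < \varepsilon\} \right) \to P(M_{t_{q-1}} < \infty = M_{t_q})$.

By our knowledge of the finite dimensional distributions from Theorem 6.2, the
left side of (50) is well approximated by

$1 - P(\tilde T^N_{t_{q-1}} \ge \varepsilon) - P(\tilde T^N_{t_q} \le g(t_q) - \varepsilon)$,

and for $\varepsilon$ sufficiently small, this converges to the right side of (50). ■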

8. Future Directions 

We have not explained here the role of the upper envelope, even though it was
included in the classification of structure functions. It is related to dual hypergraph
collapse and the size of the core, as in Cooper (2002). We shall give the corresponding
asymptotic results in a future paper.